---
title: Observability for Haystack with Opik
description: Start here to integrate Opik into your Haystack-based genai application for end-to-end LLM observability, unit testing, and optimization.
---

[Haystack](https://docs.haystack.deepset.ai/docs/intro) is an open-source framework for building production-ready LLM applications, retrieval-augmented generative pipelines and state-of-the-art search systems that work intelligently over large document collections.

In this guide, we will showcase how to integrate Opik with Haystack so that all the Haystack calls are logged as traces in Opik.

## Account Setup

[Comet](https://www.comet.com/site?from=llm&utm_source=opik&utm_medium=colab&utm_content=haystack&utm_campaign=opik) provides a hosted version of the Opik platform, [simply create an account](https://www.comet.com/signup?from=llm&utm_source=opik&utm_medium=colab&utm_content=haystack&utm_campaign=opik) and grab your API Key.

> You can also run the Opik platform locally, see the [installation guide](https://www.comet.com/docs/opik/self-host/overview/?from=llm&utm_source=opik&utm_medium=colab&utm_content=haystack&utm_campaign=opik) for more information.

Opik integrates with Haystack to log traces for all Haystack pipelines.

## Getting Started

### Installation

First, ensure you have both `opik` and `haystack-ai` installed:

```bash
pip install opik haystack-ai
```

### Configuring Opik

Configure the Opik Python SDK for your deployment type. See the [Python SDK Configuration guide](/tracing/sdk_configuration) for detailed instructions on:

- **CLI configuration**: `opik configure`
- **Code configuration**: `opik.configure()`
- **Self-hosted vs Cloud vs Enterprise** setup
- **Configuration files** and environment variables

### Configuring Haystack

In order to use Haystack, you will need to configure the OpenAI API Key. If you are using any other providers, you can replace this with the required API key. You can [find or create your OpenAI API Key in this page](https://platform.openai.com/settings/organization/api-keys).

You can set it as an environment variable:

```bash
export OPENAI_API_KEY="YOUR_API_KEY"
```

Or set it programmatically:

```python
import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
```

## Creating the Haystack pipeline

In this example, we will create a simple pipeline that uses a prompt template to translate text to German.

To enable Opik tracing, we will:

1. Enable content tracing in Haystack by setting the environment variable `HAYSTACK_CONTENT_TRACING_ENABLED=true`
2. Add the `OpikConnector` component to the pipeline

Note: The `OpikConnector` component is a special component that will automatically log the traces of the pipeline as Opik traces, it should not be connected to any other component.

```python
import os

os.environ["HAYSTACK_CONTENT_TRACING_ENABLED"] = "true"

from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

from opik.integrations.haystack import OpikConnector

pipe = Pipeline()

# Add the OpikConnector component to the pipeline
pipe.add_component("tracer", OpikConnector("Chat example"))

# Continue building the pipeline
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", OpenAIChatGenerator(model="gpt-3.5-turbo"))

pipe.connect("prompt_builder.prompt", "llm.messages")

messages = [
    ChatMessage.from_system(
        "Always respond in German even if some input data is in other languages."
    ),
    ChatMessage.from_user("Tell me about {{location}}"),
]

response = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"location": "Berlin"},
            "template": messages,
        }
    }
)

trace_id = response["tracer"]["trace_id"]
print(f"Trace ID: {trace_id}")
print(response["llm"]["replies"][0])
```

The trace is now logged to the Opik platform:

<Frame>
  <img src="/img/cookbook/haystack_trace_cookbook.png" />
</Frame>

## Cost Tracking

The `OpikConnector` automatically tracks token usage and cost for all supported LLM models used within Haystack pipelines.

Cost information is automatically captured and displayed in the Opik UI, including:

- Token usage details
- Cost per request based on model pricing
- Total trace cost

<Tip>
  View the complete list of supported models and providers on the [Supported Models](/tracing/supported_models) page.
</Tip>

<Tip>

In order to ensure the traces are correctly logged, make sure you set the environment variable `HAYSTACK_CONTENT_TRACING_ENABLED` to `true` before running the pipeline.

</Tip>

## Advanced usage

### Ensuring the trace is logged

By default the `OpikConnector` will flush the trace to the Opik platform after each component in a thread blocking way. As a result, you may disable flushing the data after each component by setting the `HAYSTACK_OPIK_ENFORCE_FLUSH` environent variable to `false`.

**Caution**: Disabling this feature may result in data loss if the program crashes before the data is sent to Opik. Make sure you will call the `flush()` method explicitly before the program exits:

```python
from haystack.tracing import tracer

tracer.actual_tracer.flush()
```

### Getting the trace ID

If you would like to log additional information to the trace you will need to get the trace ID. You can do this by the `tracer` key in the response of the pipeline:

```python
response = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"location": "Berlin"},
            "template": messages,
        }
    }
)

trace_id = response["tracer"]["trace_id"]
print(f"Trace ID: {trace_id}")
```

### Updating logged traces

The `OpikConnector` returns the logged trace ID in the pipeline run response. You can use this ID to update the trace with feedback scores or other metadata:

```python
import opik

response = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"location": "Berlin"},
            "template": messages,
        }
    }
)

# Get the trace ID from the pipeline run response
trace_id = response["tracer"]["trace_id"]

# Log the feedback score
opik_client = opik.Opik()
opik_client.log_traces_feedback_scores([
    {"id": trace_id, "name": "user-feedback", "value": 0.5}
])
```
